Search Results: "Jakub Wilk"

26 September 2014

Jakub Wilk: Pet peeves: debhelper build-dependencies (redux)

$ zcat Sources.gz | grep -o -E 'debhelper [(]>= 9[.][0-9]{,7}([^0-9)][^)]*)?[)]' | sort | uniq -c | sort -rn
    338 debhelper (>= 9.0.0)
     70 debhelper (>= 9.0)
     18 debhelper (>= 9.0.0~)
     10 debhelper (>= 9.0~)
      2 debhelper (>= 9.2)
      1 debhelper (>= 9.2~)
      1 debhelper (>= 9.0.50~)
Is it a way to protest against the current debhelper's version scheme?
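For readers who prefer Python to shell pipelines, the same tally can be sketched as follows. This is only an illustration: the function name count_debhelper_constraints and the sample input are made up for this example, and the regular expression is the one from the command above with the repetition written out as {0,7}.

```python
import re
from collections import Counter

# Same pattern as in the grep command: "9." followed by at most 7 digits,
# so that 8-digit date-based versions are not counted.
PATTERN = re.compile(r'debhelper [(]>= 9[.][0-9]{0,7}([^0-9)][^)]*)?[)]')

def count_debhelper_constraints(text):
    """Count each distinct matching constraint, most common first."""
    counts = Counter(m.group(0) for m in PATTERN.finditer(text))
    return counts.most_common()

# Hypothetical sample standing in for the contents of Sources.gz.
sample = (
    "Build-Depends: debhelper (>= 9.0.0), foo\n"
    "Build-Depends: debhelper (>= 9), bar\n"
    "Build-Depends: debhelper (>= 9.0.0)\n"
    "Build-Depends: debhelper (>= 9.0~), baz\n"
    "Build-Depends: debhelper (>= 9.20120101)\n"
)
for constraint, n in count_debhelper_constraints(sample):
    print(f"{n:7d} {constraint}")
```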

4 September 2014

Jakub Wilk: Joys of East Asian encodings

In i18nspector I try to support all the encodings that were blessed by gettext, but it turns out to be more difficult than I anticipated:
$ roundtrip() { c=$(echo $1 | iconv -t $2); printf '%s -> %s -> %s\n' $1 $c $(echo $c | iconv -f "$2"); }
$ roundtrip ¥ EUC-JP
¥ -> \ -> \
$ roundtrip ¥ SHIFT_JIS
¥ -> \ -> ¥
$ roundtrip ₩ JOHAB
₩ -> \ -> ₩
Now let's do the same in Python:
$ python3 -q
>>> roundtrip = lambda s, e: print('%s -> %s -> %s' % (s, s.encode(e).decode('ASCII', 'replace'), s.encode(e).decode(e)))
>>> roundtrip('¥', 'EUC-JP')
¥ -> \ -> \
>>> roundtrip('¥', 'SHIFT_JIS')
¥ -> \ -> \
>>> roundtrip('₩', 'JOHAB')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
UnicodeEncodeError: 'johab' codec can't encode character '\u20a9' in position 0: illegal multibyte sequence
So is 0x5C a backslash or a yen/won sign? Or both? And what if 0x5C could be a second byte of a two-byte character? What could possibly go wrong?
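To make the last question concrete, here is a small sketch that looks for characters whose multi-byte encoding contains the byte 0x5C. The helper name chars_containing_0x5c is made up for this example, and it relies only on Python's built-in shift_jis codec; other codecs and other tools may behave differently.

```python
def chars_containing_0x5c(text, encoding):
    """Return (character, encoded bytes) pairs for characters whose
    multi-byte encoding contains the byte 0x5C."""
    hits = []
    for ch in text:
        encoded = ch.encode(encoding)
        # Single-byte 0x5C is the backslash (or yen/won) itself; we only
        # care about 0x5C appearing inside a multi-byte sequence.
        if len(encoded) > 1 and 0x5C in encoded:
            hits.append((ch, encoded))
    return hits

print(chars_containing_0x5c("表ソa\\", "shift_jis"))
```

Any program that scans such bytes for backslash escapes without decoding them first will misread the second byte of these characters.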

29 August 2014

Jakub Wilk: More spell-checking

Have you ever wanted to use Lintian's spell-checker against arbitrary files? Now you can do it with spellintian:
$ zrun spellintian --picky /usr/share/doc/RFC/best-current-practice/rfc*
/tmp/0qgJD1Xa1Y-rfc1917.txt: amoung -> among
/tmp/kvZtN435CE-rfc3155.txt: transfered -> transferred
/tmp/o093khYE09-rfc3481.txt: unecessary -> unnecessary
/tmp/4P0ux2cZWK-rfc6365.txt: charater -> character
mwic (Misspelled Words In Context) takes a different approach. It uses classic spell-checking libraries (via Enchant), but it groups misspellings and shows them in their contexts. That way you can quickly filter out false-positives, which are very common in technical texts, using visual grep:
$ zrun mwic /usr/share/doc/debian/social-contract.txt.gz
DFSG:
   an Free Software Guidelines (DFSG)
   an Free Software Guidelines (DFSG) part of the
                                ^^^^
Perens:
     Bruce Perens later removed the Debian-spe 
  by Bruce Perens, refined by the other Debian 
           ^^^^^^
Ean, Schuessler:
  community" was suggested by Ean Schuessler. This document was drafted
                              ^^^ ^^^^^^^^^^
GPL:
  The "GPL", "BSD", and "Artistic" lice 
       ^^^
contrib:
  created "contrib" and "non-free" areas in our 
           ^^^^^^^
CDs:
  their CDs. Thus, although non-free wor 
        ^^^

21 February 2014

Jakub Wilk: For those who care about snowclones

Instances of the "for those who care about X" snowclone on Debian mailing lists:

25 December 2013

Jakub Wilk: dput

My Internet connection is too flaky to use dput(-ng) reliably, so I use this tiny replacement instead:
#!/bin/sh
dcmd rsync --chmod=0644 -P "$@" ssh.upload.debian.org:/srv/upload.debian.org/UploadQueue/

5 December 2013

Jakub Wilk: A-za-z a-zA-z

Releasing the shift key is hard.

4 November 2013

Jakub Wilk: ~/.netrc security

TL;DR: don't put valuable passwords in ~/.netrc.

In the olden days, the ~/.netrc file was used for storing FTP usernames and passwords. These days we have clients of other protocols that use said file. Perhaps your IMAP or SMTP client uses it. So you put your e-mail account's password into ~/.netrc, and then meticulously configured the clients to always connect via TLS and to verify server certificates. You feel secure. But you shouldn't.

Here's how an attacker capable of MiTM can exploit wget to steal ~/.netrc passwords:

1) Alice tries to download a file over HTTP:
$ wget http://xkcd.com/538/
2) Eve takes over the HTTP connection, sending a redirection response:
HTTP/1.1 303 See Other
Location: http://supersecuremail.example.net/
3) Alice's wget follows the redirection.
4) Eve takes over the connection to supersecuremail.example.net, requesting password authentication:
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="moo"
5) Alice's wget sends the supersecuremail.example.net password straight to Eve.
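The attack works because the credentials are looked up by host name alone, regardless of how the client ended up talking to that host. As a rough sketch of the kind of check that would block it (this is not how wget behaves; credentials_for and its parameters are hypothetical names for this illustration), a client could refuse to use ~/.netrc entries over cleartext HTTP or for hosts the user did not ask for:

```python
import netrc
from urllib.parse import urlsplit

def credentials_for(url, requested_host, netrc_path):
    """Return (login, password) from the netrc file, or None if sending
    them to this URL looks unsafe. Hypothetical helper, for illustration."""
    parts = urlsplit(url)
    # Never send stored passwords over an unencrypted connection.
    if parts.scheme != "https":
        return None
    # Never send them to a host reached only through a redirection.
    if parts.hostname != requested_host:
        return None
    entry = netrc.netrc(netrc_path).authenticators(parts.hostname)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password
```

With such a check, step 5 above would yield no credentials, because the URL is plain HTTP and the host differs from the one Alice requested.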

18 August 2013

Ben Armstrong: Taskwarrior new blog, getting involved

The Taskwarrior Team has just started a blog, kicking it off with a series of articles about development of Taskwarrior itself. Go, team!

Almost from the moment I started using Taskwarrior (thanks to Jakub Wilk for an excellent job maintaining this) I knew I had finally found the todo system that I could love. Right away, I started hanging out with the Taskwarrior community at #taskwarrior @ irc.freenode.net and found out what an awesome bunch of people they are, both developers and users alike.

I have plunged in with bug reports and feature requests, and am helping get more Taskwarrior-related things into Debian. In NEW right now I have uploaded several of the dependencies needed by my ITP of taskwarrior-web. It's looking like I'll have that finished later this month or early next month. Also, I have uploaded a wheezy backport of Taskwarrior itself (aka "task") which, if all goes well, enters the archive next week.

20 April 2013

Ulrich Dangel: Analyzing rc bug messages

Michael Stapelberg recently posted a blog post about looking into the number of Debian Developers actively working on RC bugs for the upcoming wheezy release. In this blog post I analyze the data shared by Michael and provide the R commands used to generate the plots and findings. If you are interested in looking into the data yourself, but don't like R, I suggest using ipython notebook + numpy instead.

Analysis

After parsing the data file we typically want to get an understanding of the data. Using summary(bugs) we get the minimum (1), median (5), mean (15.4), maximum (716) and quartiles of the data. This shows that the number of messages is widely spread and that a few people contribute a lot. To visualize the dispersion of the data we can create a box plot showing the range of messages:

[Figure: box plot]

As the first and third quartiles are close together, we can assume that the majority of the work is done by a few, especially since the median is 5. This is supported by the histogram below, where the x axis is the number of recorded messages and the y axis is the number of developers.

[Figure: histogram]

Top 10 contributors

The top 10 contributors, according to the dataset, are:
  1. Lucas Nussbaum - 716 messages
  2. Gregor Herrmann - 270 messages
  3. Jakub Wilk - 270 messages
  4. Andreas Beckmann - 225 messages
  5. Julien Cristau - 205 messages
  6. Cyril Brulebois - 169 messages
  7. Moritz Muehlenhoff - 162 messages
  8. Michael Biebl - 159 messages
  9. Salvatore Bonaccorso - 158 messages
  10. Christoph Egger - 142 messages

R commands

These are the commands used to generate the plots and information in this post:
bugs <- read.csv("by-msg.csv")
summary(bugs)
boxplot(bugs$rcbugmsg, log='y', range=0, ylab="# bugs")
quantile(bugs$rcbugmsg)
#   0%  25%  50%  75% 100%
#    1    2    5   12  716
# create histogram
library('ggplot2')
ggplot(bugs, aes(x=rcbugmsg)) + geom_histogram(binwidth=.5, colour="black", fill="black") + scale_x_sqrt()
top10 <- tail(bugs[order(bugs$rcbugmsg),], 10)
top10

10 April 2013

Paul Wise: Inadequate software

Just 168 of the 4961 packages (3%) I have installed are inadequate. Unfortunately those packages collectively have 3440 inadequacies. How much of the software on your system has these inadequacies? You can find out today by installing Jakub Wilk's software, which is appropriately named adequate. It is now available in Debian unstable. I recommend enabling the apt hook which notifies you when software you are installing is inadequate. Other ways of being notified when you are installing inadequate software include apt-listbugs and debsecan. If you are interested in software quality, Debian's QA activities wiki page provides a good overview of the quality assurance activities that are being worked on within the context of Debian. If you want to provide better quality software for Debian, please keep an eye on the PTS pages for software you maintain. You can also run various automated checks on your software before you make new releases or upload them to the Debian archive. More people are needed to improve and expand upon Debian's existing quality assurance activities and infrastructure. Come join us today!

12 January 2013

Jakub Wilk: Internationalization environment variables

Setting internationalization environment variables is a bit tricky. For example, this:
$ LANG=sv_SE.UTF-8 stat /nonexistent
may look like a way to make stat(1) print the error message in Swedish. Yet there are many ways it could go wrong:
  1. LC_MESSAGES could be set in the environment, overriding LANG.
  2. LC_ALL could be set, overriding both LC_MESSAGES and LANG.
  3. LANGUAGE could be set, overriding LC_ALL, LC_MESSAGES, and LANG. For extra complexity, LANGUAGE has no effect if LC_MESSAGES is effectively set to C. (Also, this variable is a GNUism.)
  4. The locale could be simply missing from the system.
To make these things a little less intricate, I wrote localehelper. It's a bit like env(1), but it takes care of the pitfalls listed above. This does the right thing:
$ localehelper LANG=sv_SE.UTF-8 stat /nonexistent
stat: kan inte ta status på ”/nonexistent”: Filen eller katalogen finns inte
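The precedence rules from the list above can be summarized in a few lines of Python. This is a simplified model for illustration only, not localehelper's implementation: message_locales is a made-up name, empty variables are treated as unset, and only the plain "C" and "POSIX" locale names are treated as disabling LANGUAGE.

```python
def message_locales(env):
    """Return the locales that would be tried for messages, in order,
    given a dict of environment variables (simplified model)."""
    # LC_ALL overrides LC_MESSAGES, which overrides LANG.
    base = env.get("LC_ALL") or env.get("LC_MESSAGES") or env.get("LANG") or "C"
    # LANGUAGE (a GNU extension) has no effect if the result is the C locale.
    if base in ("C", "POSIX"):
        return [base]
    # Otherwise LANGUAGE, a colon-separated priority list, wins.
    preferred = [item for item in env.get("LANGUAGE", "").split(":") if item]
    return preferred or [base]

print(message_locales({"LANG": "sv_SE.UTF-8", "LANGUAGE": "de:en"}))
```

Note that the model says nothing about the fourth pitfall: whether the chosen locale is actually installed on the system.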

5 January 2013

Paul Tagliamonte: Updates to dput-ng since version 1.0

Big release notes since 1.0: We've got a new list, dput-ng-maint@lists.alioth.debian.org; feel free to subscribe!
1.3:
  * Avoid failing on upload if a pre/post upload hook is missing from the
    Filesystem.
  * Fix "dcut raises FtpUploadException" by correctly initializing the uploader
    classes from dcut (Closes: #696467)
1.2:
  * Add bash completions for dput-ng (Closes: #695412).
  * Add in a script to set the default profile depending on the building
    distro (Ubuntu support)
  * Fix a bug where meta-class info won't be loaded if the config file has the
    same name.
  * Add an Ubuntu upload target.
  * Added .udeb detection to the check debs hook.
  * Catch the correct exception falling out of bin/dcut
  * Fix the dput manpages to use --uid rather than the old --dm flag.
  * Fix the CLI flag registration by setting required=True
    in cancel and upload.
  * Move make_delayed_upload above the logging call for sanity's sake.
  * Fix "connects to the host even with -s" (Closes: #695347)
Thanks to everyone who's contributed!
     7  Bernhard R. Link
     4  Ansgar Burchardt
     3  Luca Falavigna
     2  Michael Gilbert
     2  Salvatore Bonaccorso
     1  Benjamin Drung
     1  Gergely Nagy
     1  Jakub Wilk
     1  Jimmy Kaplowitz
     1  Luke Faraone
     1  Sandro Tosi
This has been your every-once-in-a-while dput-ng update. We're looking for more code contributions (to make sure everyone's happy), doc updates (etc) or ideas.

19 November 2012

Vasudev Kamath: Weekly Log - 17/23 - 112012

The last week was quite productive, as I was on vacation in my home town, but sadly I couldn't complete this post, so I am again merging that work with this week's. This week wasn't very productive, as I was tired from the journey back and didn't get enough time to recover. So here it goes.

Debian Related

Upstream Related

I raised pull request #8 on Gubbi, making the Makefile more organized and introducing xz compression in it. Additionally, I removed distribution-specific stuff from the Makefile and made it generic.

That's all for the two weeks. This week there will be foss.in, and we will be having a Debian-specific mini conference, including some Debian basics for newbies and some bug squashing, if any :-). So more to report next week, till then Cya.

11 November 2012

Jakub Wilk: All those translations, how did you ever manage it?

I've just released the initial version of gettext-inspector, a tool for checking gettext PO/POT/MO files. While it's in an early stage of development, it's already able to detect a wide range of problems. For example, this is what it emits on my system:
$ gettext-inspector /usr/share/locale/*/LC_MESSAGES/*.mo | cut -d ' ' -f 1,3 | sort | uniq -c | sort -rn
   1902 P: no-language-header-field
   1601 P: no-version-in-project-id-version
   1372 W: no-report-msgid-bugs-to-header-field
    273 P: invalid-content-transfer-encoding
    201 W: invalid-date
     78 W: syntax-error-in-plural-forms
     77 I: no-package-name-in-project-id-version
     63 W: boilerplate-in-report-msgid-bugs-to
     50 W: language-disparity
     47 I: unknown-header-field
     38 W: invalid-language
     25 W: boilerplate-in-project-id-version
     10 I: unknown-poedit-language
      8 W: no-date-header-field
      5 W: no-project-id-version-header-field
      5 W: c1-control-characters
      3 P: no-mime-version-header-field
      3 P: no-content-transfer-encoding-header-field
      2 W: non-portable-encoding
      1 W: invalid-report-msgid-bugs-to
      1 W: ancient-date
      1 I: unable-to-determine-language

28 October 2012

Gregor Herrmann: RC bugs 2012/43

here's the short overview about my activities around release-critical bugs for the last week:

13 October 2012

Vasudev Kamath: Weekly Log - 06/13-102012

Update for 06/10/2012

I did not write the weekly work log last week, first because the week was short (thanks to all these bandhs, the week was, to be frank, only 4 days long) and second because of my laziness. Here goes the update:
* After a long discussion with Aravinda on "why productivity is reducing these days", we concluded that social networks are eating up most of our time. So a decision was made and I closed all pinned tabs for Twitter, Gmail, Identi.ca and Friendica in my browser.
* After the above resolution I finished almost 4 chapters of Modern Perl within 2 hours! Indeed, social networks kill productivity.

Update for 13/10/2012

I've really fallen in love with computerised bots, thanks to the wonderful KGB bot :-). So the majority of my work this week is on bots.

Jabber Dictionary Redesign

After thinking for a while I decided to rewrite the dictionary bot, which got an overwhelming response when released, as seen in the comments of the above link. A few reasons for the rewrite:
  1. Generalise the bot framework so single bot can handle multiple languages
  2. Improve the data collection on bot side
  3. Current code base was not very well organised and trying to add more feature it had become messy.
  4. Provide XEP support to help in data collections
Few changes which are already implemented include
  1. New code is now using pyxmpp2 library instead of GPLed xmppy used by current code.
  2. Implemented XEP-0071 extension to properly format the meaning display by the bot
  3. Current implementation was displaying all words in one set without distinguishing between adjectives,verbs proverbs etc. even though wiktionary displays meaning based on this. New implementation gives out meaning in same format as it is displayed on wiktionary.
Things remaining todo
  1. Separating wiktionary parsing logic from bot code by providing some sort of intermediate interface between bot and wiktionary parsers.
  2. Adding more language wiktionary parsers and teaching the bot to become multilingual :-)
  3. Integrating XEP-0004 (Data Forms) for taking the meaning input from user. Current code requires user to enter data in particular format
suckless-tools fix for Wheezy

Jakub Wilk suggested that I prepare a minimal version of suckless-tools for Wheezy which includes a patch for bug #685611, plus a few minor changes: taking over the package and fixes in the copyright file. Hopefully the release team will be okay with these changes. I'm waiting for the upload to file an unblock request.

I did face a problem: I was halfway through with suckless-tools_39-1 when Jakub asked me about this change, and the current repository was a fresh one prepared for 39-1 which didn't have history for version 38. First I thought of preparing a separate repository for version 38, which was not a correct option, but I couldn't play with the current repository either. So finally I renamed the current repository to suckless-tools-39.git on collab-maint and prepared a fresh repository, suckless-tools.git, basing it on version 38-1. From version 39, suckless-tools will follow the 3.0 (quilt) source format and will not work with git-buildpackage, as the tool can't handle multi-tarball packages. Yes, every tool involved in the package will have a separate tarball from 39.

More work on Bots

Well, today I again worked on a Jabber bot, but not the dictionary bot. This time it's an SMS Gateway bot for Jonas Smedegaard, and the coding was done in Perl. Thanks to Jonas I could finally apply what I learnt in Perl. The code may not be very elegant but it works :-). And after hacking one full day in Perl, I'm now not feeling very interested in going back to hack on my own Python-based dictionary bot :-). But I will anyway.

Misc

So that's it, folks. Quite a longish post; I hope you are not bored reading it :-). Well, time for a movie. C'ya all with next week's log.

26 August 2012

Gregor Herrmann: RC bugs 2012/34

good news: I'm seeing more & more people contributing to RC bugs in the BTS. here are my own contributions for the past week:

19 August 2012

Gregor Herrmann: RC bugs 2012/33

it's sunday evening again, & again, here comes my report about activities around RC bugs:

15 August 2012

Jakub Wilk: Spell-checker for irssi

I tried many spell-checkers for irssi, and they all sucked (and some of them were also completely insane). So I took the one that seemed least crazy and forked it.
